ABSTRACT

We present the first principal component analysis (PCA) applied to a sample of 119
Spitzer Infrared Spectrograph (IRS) spectra of local ultraluminous infrared galaxies
(ULIRGs) at z < 0.35. The purpose of this study is to objectively and uniquely
characterise the local ULIRG population using all information contained in the observed
spectra. We have derived the first three principal components (PCs) from the
covariance matrix of our dataset, which account for over 90% of the variance. The first
PC is characterised by dust temperatures and the geometry of the mix of source and dust.
The second PC is a pure star formation component. The third PC represents an
anti-correlation between star formation activity and a rising AGN. Using the first three
PCs, we are able to accurately reconstruct most of the spectra in our sample. Our work
shows that several factors are important in characterising the ULIRG population:
dust temperature, geometry, star formation intensity, AGN contribution, etc. We also
compare PCA with other diagnostics, such as the ratio of the 6.2 μm PAH emission
feature to the 9.7 μm silicate absorption depth, and with other observables such as
optical spectral type. An electronic version of the first three PCs of the local ULIRG
population is available at http://astronomy.sussex.ac.uk/~lw94/PCA/






Key words: galaxies: statistics - infrared: galaxies. 



* E-mail: lingyu.@sussex.ac.uk

1 INTRODUCTION

Ultraluminous infrared galaxies (ULIRGs), discovered from ground-based infrared
photometry in the 1970s (Rieke & Low 1972), are usually defined to be galaxies with
bolometric luminosities from 8-1000 μm > 10^12 L⊙. Over the last decade or so, we
have learnt from optical and near-infrared imaging that ULIRGs are mostly interacting
or merging systems (e.g. Armus, Heckman & Miley 1987; Melnick & Mirabel 1990;
Hutchings & Neff 1991; Clements et al. 1996; Murphy et al. 1996; Surace et al. 2000;
Farrah et al. 2001; Veilleux, Kim & Sanders 2002). From spectral energy distribution
(SED) modelling, optical, UV and mid-infrared spectroscopic studies, and X-ray and
radio imaging, it seems that the power source in these galaxies is usually some
combination of star formation and mass accretion onto the central black hole, with the
former being the dominant contributor and the latter usually increasing in importance
as a function of infrared luminosity (e.g. Rowan-Robinson & Crawford 1989; Smith
1998; Farrah et al. 2003, 2007; Ptak et al. 2003; Franceschini et al. 2003; [...]
2007). A basic evolutionary picture is that two or possibly more gas-rich spiral
galaxies collide with each other. The collision triggers bursts of star formation in
centrally concentrated and compressed interstellar gas and active galactic nuclei (AGN)
activity (Mihos & Hernquist 1996; Sanders & Mirabel 1996; Moorwood 1996; Lonsdale,
Farrah & Smith 2007). The merged galaxy then turns into an elliptical and perhaps
eventually an optical QSO.

With the advent of the Infrared Space Observatory (ISO; Kessler et al. 1996) and the
Infrared Spectrograph (IRS; Houck et al. 2004) on the Spitzer Space Telescope (Werner
et al. 2004), mid-infrared spectroscopy has become a powerful tool for studying the
nature of ULIRGs. Broad polycyclic aromatic hydrocarbon (PAH) features, the strongest
of which are located at 6.2, 7.7, 8.6, 11.2 and 12.7 μm, are ubiquitous in normal
galaxies and starburst systems with moderately intense UV radiation but weak or absent
near an AGN. This indicates that the strength of PAHs is correlated with the
origin of activity and thus can be used to separate a buried
AGN from a starburst. Another tool to disentangle the two
energy sources is the optical depth of the silicate dust absorption
features at 9.7 and 18 μm, which is associated with
source geometry (Imanishi et al. 2007). High-ionization fine-structure
lines also allow us to assess the dominant ionization
mechanism. For example, the presence of the [Ne V] line
at 14.3 μm requires photons of energy > 97 eV and therefore
indicates the presence of an AGN. Armus et al. (2007) found
that ULIRGs selected from the IRAS Bright Galaxy Sample
(Soifer et al. 1987) have a large range in spectral slope, silicate
optical depth and PAH strength, indicative of a diverse
and complex population. Using diagnostics such as the ratio
of high- to low-excitation mid-infrared emission lines, the 7.7 μm
line-to-continuum ratio, and the 6.2 μm PAH emission
and the 9.7 μm silicate feature, most ULIRGs are found to
be starburst-dominated, with at least 50% showing both starburst
and AGN activity (e.g. Genzel et al. 1998; Rigopoulou
et al. 1999; Lutz et al. 1999; Spoon et al. 2007).

However, there are some difficulties in using the aforementioned
diagnostics, e.g. the blending of the 7.7 μm feature
with the adjacent 8.6 μm feature and the difficulty of accurately
defining the underlying continuum, especially for
strongly obscured sources. Different groups measure PAHs
in different ways, and their estimates can sometimes differ
by a factor of two. In addition, different diagnostics often
lead to different assessments of the dominant power source.
Principal component analysis (PCA) has been used for spec- 
tral classification of optical galaxies with great success (e.g. 
Connolly et al. 1995; Bromley et al. 1998; Folkes et al. 1999; 
Madgwick et al. 2002, 2003; de Lapparent et al. 2003). In 
this paper, we present the first PCA of mid-infrared spectra 
which uses all information contained in the observed spectra 
while many traditional methods only use a few pixels of the 
whole spectrum and thus may lose important information. 

This paper is organised as follows. In Section 2, we 
briefly describe our sample of mid-infrared spectra. In Sec- 
tion 3, we introduce the PCA method, its implementation 
and the most important principal components (PCs) derived 
from the covariance matrix. The stability of the mean spec- 
trum and the PCs is then investigated using a bootstrap 
resampling technique. In Section 4, we present eigenvector 
decomposition of a few selected spectra and discuss the ef- 
fect of each PC. Our sample of spectra is then divided into 
eight types according to the sign of the contribution from 
each PC. Comparisons with other diagnostics and other observables
are presented in Section 5. Finally, conclusions and
discussions are presented in Section 6.



2 THE DATA 

2.1 Sample Selection 

We selected our sample from two observing programs: those
ULIRGs observed as part of the IRS Guaranteed Time program
(Armus et al. 2006; Spoon et al. 2007; Farrah et al.
2007; Desai et al. 2007), and those observed by Imanishi et
al. (2007). Both the IRS-GTO and Imanishi samples were
selected from the IRAS 1 Jy and 2 Jy surveys and together
comprise virtually all known ULIRGs at z < 0.5. We imposed
an upper redshift cut of z = 0.35 to ensure that we sample
approximately the same wavelength range for each object,
and removed a further eight objects as they have poor quality
data in the longer wavelength IRS modules due to failed
peak-ups or other observing difficulties. The resulting sample
comprises 119 objects, listed in Table 1.

2.2 Observations 

All objects were observed with both orders of the Short-Low 
(SL) and Long-Low (LL) modules of the IRS. Observations 
were performed in staring mode. The targets were acquired 
by performing a high-accuracy IRS peak-up with the blue 
array on the target itself, or by peaking up on a nearby Two 
Micron All-Sky Survey (2MASS; Skrutskie et al. 2006) star 
and offsetting to the target. Each galaxy was observed at two 
nod positions within each of the IRS orders. The resulting 
spectra have a spectral resolution of R ~ 80 over 5-38 μm.

2.3 Data Reduction 

The data were processed through the Spitzer Science Cen- 
ter's pipeline software (version 18.7), which performs stan- 
dard tasks such as ramp fitting and dark current subtrac- 
tion, and produces Basic Calibrated Data (BCD) frames. 
Starting with these frames, we removed rogue pixels using
the IRSCLEAN tool¹ and campaign-based pixel masks.
The individual frames at each nod position were then com- 
bined into a single image using the SMART software (Hig- 
don et al. 2010). Sky background was removed from each 
image by subtracting the image for the same object taken 
with the other nod position (i.e. 'nod-nod' sky subtrac- 
tion). One-dimensional spectra were then extracted from 
the images using 'optimal' extraction and default param- 
eters (LeBouteiller et al. 2010). This procedure results in 
separate spectra for each nod and for each order. The spec- 
tra for each nod were inspected; features present in only one 
nod were treated as artefacts and removed. The two nod 
positions were then combined. The first and last 4 pixels on 
the edge of each order, corresponding to regions of decreased 
sensitivity on the array, were then removed, and the spec- 
tra in different orders merged, to give the final spectrum for 
each object. 



3 PRINCIPAL COMPONENT ANALYSIS 
(PCA) 

3.1 A brief introduction to PCA

PCA is a non-parametric tool, often used to reduce a complex
dataset to a simpler structure and to search for correlations.
A primary benefit of PCA arises from
quantifying the importance of each dimension for describing
the variability of a dataset.

Suppose we have M spectra. The spectrum of the kth
galaxy, a sequence of N numbers $(f_1^k, \ldots, f_N^k)$, can be treated
as a vector in an N-dimensional space, where N is the number
of spectral channels. We can then form a covariance
matrix of all the spectra,

1 The IRSCLEAN package can be downloaded from
http://ssc.spitzer.caltech.edu



PCA of the Spitzer IRS spectra of ULIRGs

Table 1. Our sample of 119 nearby ULIRGs at redshift z ≤ 0.35.

Name               Redshift  Name               Redshift  Name               Redshift  Name               Redshift
IRAS 00091-0738    0.12      IRAS 06009-7716    0.12      IRAS 12514+1027    0.32      IRAS 19254-7245    0.06
IRAS 00183-7111    0.33      IRAS 06035-7102    0.08      IRAS 13218+0552    0.20      IRAS 19297-0406    0.09
IRAS 00188-0856    0.13      IRAS 06206-6315    0.09      IRAS 13335-2612S   0.12      IRAS 19458+0944    0.10
IRAS 00199-7426    0.10      IRAS 06361-6217    0.16      IRAS 13342+3932    0.18      IRAS 20037-1547    0.19
IRAS 00275-0044    0.24      IRAS 07145-2914    0.01      IRAS 13352+6402    0.24      IRAS 20087-0308    0.11
IRAS 00275-2859    0.28      IRAS 07598+6508    0.15      IRAS 13451+1232    0.12      IRAS 20100-4156    0.13
IRAS 00397-1312    0.26      IRAS 08559+1053    0.15      IRAS 13509+0442    0.14      IRAS 20414-1651    0.09
IRAS 00406-3127    0.34      IRAS 08572+3915    0.06      IRAS 13539+2920    0.11      IRAS 20551-4250    0.01
IRAS 00456-2904SW  0.11      IRAS 09039+0503    0.12      IRAS 14060+2919    0.12      IRAS 21208-0519N   0.13
IRAS 01003-2238    0.12      IRAS 09116+0334    0.15      IRAS 14070+0525    0.26      IRAS 21272+2514    0.15
IRAS 01166-0844SE  0.12      IRAS 09539+0857    0.13      IRAS 14252-1550    0.15      IRAS 23060+0505    0.17
IRAS 01199-2307    0.16      IRAS 10091+4704    0.25      IRAS 14348-1447    0.08      IRAS 23128-5919    0.04
IRAS 01298-0744    0.14      IRAS 10378+1109    0.14      IRAS 14378-3651    0.07      IRAS 23230-6926    0.11
IRAS 01355-1814    0.19      IRAS 10485-1447W   0.13      IRAS 15001+1433    0.16      IRAS 23253-5415    0.13
IRAS 01388-4618    0.09      IRAS 10494+4424    0.09      IRAS 15206+3342    0.12      IRAS 23327+2913    0.11
IRAS 01494-1845    0.16      IRAS 10565+2448    0.04      IRAS 15225+2350    0.14      IRAS 23498+2423    0.21
IRAS 01569-2939    0.14      IRAS 11038+3217    0.13      IRAS 15250+3609    0.06      IRAS 23578-5307    0.12
IRAS 02054+0835    0.34      IRAS 11095-0238    0.11      IRAS 15462-0450    0.10      3C273              0.16
IRAS 02113-2937    0.19      IRAS 11130-2659    0.14      IRAS 16300+1558    0.24      Arp220             0.02
IRAS 02455-2220    0.30      IRAS 11223-1244    0.20      IRAS 16334+4630    0.19      Mrk1014            0.16
IRAS 02530+0211    0.03      IRAS 11387+4116    0.15      IRAS 16468+5200E   0.15      Mrk231             0.04
IRAS 03000-2719    0.22      IRAS 11506+1331    0.13      IRAS 16487+5447    0.10      Mrk273             0.04
IRAS 03158+4227    0.13      IRAS 11582+3020    0.22      IRAS 17028+5817    0.11      Mrk463             0.05
IRAS 03250+1606    0.13      IRAS 12018+1941    0.17      IRAS 17044+6720    0.13      NGC6240            0.02
IRAS 03521+0028    0.15      IRAS 12032+1707    0.22      IRAS 17068+4027    0.18      PG1119+120         0.05
IRAS 03538-6432    0.30      IRAS 12071-0444    0.13      IRAS 17179+5444    0.15      PG1211+143         0.08
IRAS 04103-2838    0.12      IRAS 12112+0305    0.07      IRAS 17208-0014    0.04      PG1351+640         0.09
IRAS 04114-5117    0.12      IRAS 12127-1412NE  0.13      IRAS 17463+5806    0.31      PG2130+099         0.06
IRAS 04313-1649    0.27      IRAS 12205+3356    0.26      IRAS 18030+0705    0.15      UGC5101            0.01
IRAS 05189-2524    0.04      IRAS 12359-0725    0.14      IRAS 18443+7433    0.13











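$$C_{ij} = \frac{1}{M} \sum_{k=1}^{M} f_i^k f_j^k, \qquad 1 \le i, j \le N, \qquad (1)$$

where k runs over the M spectra and i, j run over the N spectral channels.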

The covariance matrix can be diagonalized by the matrix
of its orthogonal eigenvectors (or eigenspectra)
$E = [e_1, e_2, \ldots, e_N]$,

$$C = E D E^{T}, \qquad (2)$$

where D is a diagonal matrix whose diagonal terms are the
eigenvalues of the corresponding eigenvectors. In the language
of PCA, the eigenvectors, each of which is a linear
combination of the original spectra, are the principal components
(PCs). The PCs are ordered by their importance,
i.e. the percentage of variance they account for. In
some cases, only a few PCs are needed to describe the data,
which is why PCA is used to reduce the dimensionality of
a complex dataset. The underlying assumption of PCA is
that eigenvectors associated with large eigenvalues (large variance)
reveal important structures, while those associated with small
eigenvalues represent noise. However, PCA does have limitations
because it assumes that the variance is sufficient to represent
the data. Therefore, PCA is strictly appropriate only for data
that follow a multivariate Gaussian distribution.
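
As an illustration of Equations (1) and (2), the following minimal sketch builds a covariance matrix from a set of spectra and diagonalises it with NumPy. It is not the authors' code: the function name `pca_from_covariance` and the toy data are hypothetical, and the mean is subtracted before forming the covariance matrix, as described in Section 3.2.

```python
import numpy as np

def pca_from_covariance(spectra):
    """PCA via eigen-decomposition of the covariance matrix.

    spectra : array of shape (M, N), i.e. M spectra with N channels each.
    Returns the mean spectrum, the PCs as rows (ordered by decreasing
    variance), the eigenvalues and the fraction of variance per PC.
    """
    spectra = np.asarray(spectra, dtype=float)
    n_spectra = spectra.shape[0]
    mean_spectrum = spectra.mean(axis=0)
    residuals = spectra - mean_spectrum
    # Eq. (1): C_ij = (1/M) sum_k f_i^k f_j^k on mean-subtracted spectra
    covariance = residuals.T @ residuals / n_spectra
    # Eq. (2): C = E D E^T; eigh returns eigenvalues in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    pcs = eigenvectors[:, order].T
    variance_fraction = eigenvalues / eigenvalues.sum()
    return mean_spectrum, pcs, eigenvalues, variance_fraction

# Toy data (assumption): 50 "spectra" built from 3 hidden components plus noise
rng = np.random.default_rng(0)
toy_spectra = rng.normal(size=(50, 3)) @ rng.normal(size=(3, 20))
toy_spectra += 0.01 * rng.normal(size=(50, 20))
mean_spectrum, pcs, eigenvalues, variance_fraction = pca_from_covariance(toy_spectra)
print(variance_fraction[:3].sum())
```

For the toy data, nearly all of the variance is carried by the first three PCs, which is the situation in which truncating the expansion is useful.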

3.2 Implementation of PCA and the derived PCs 

First, we de-redshift each observed spectrum
to its rest frame. We have chosen a rest-frame wavelength
range of 5.37-23.7 μm, which is covered by all spectra in
our dataset, and divided it into 268 equally-spaced bins
in linear wavelength space. Each spectrum is normalised so
that the mean flux over the whole wavelength range is unity.
The mean spectrum of our sample is then subtracted from
each spectrum. We form a covariance matrix of all mean-subtracted
spectra and then derive the eigenvectors/PCs.
The mean spectrum and the first four PCs are shown in
Fig. 1. We emphasise that the PCs represent the difference
between the observed spectra and the mean spectrum of the
sample.
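
The preprocessing steps described above might be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the rebinning scheme, so linear interpolation with `np.interp` is assumed, the 268 bins are represented by 268 equally spaced grid points, and the function names and toy spectra are hypothetical.

```python
import numpy as np

# Rest-frame grid: 268 equally spaced points over 5.37-23.7 micron (assumption:
# bins are represented by grid points including both end points)
REST_GRID = np.linspace(5.37, 23.7, 268)

def to_rest_frame(obs_wavelength, flux, z, grid=REST_GRID):
    """De-redshift a spectrum, resample it onto the common grid and
    normalise it to unit mean flux over the grid."""
    rest_wavelength = np.asarray(obs_wavelength, dtype=float) / (1.0 + z)
    # Linear interpolation is an assumption, not the documented scheme
    resampled = np.interp(grid, rest_wavelength, flux)
    # Any overall (1 + z) flux factor drops out in this normalisation
    return resampled / resampled.mean()

def subtract_mean(normalised_spectra):
    """Return the sample mean spectrum and the mean-subtracted spectra."""
    normalised_spectra = np.asarray(normalised_spectra, dtype=float)
    mean_spectrum = normalised_spectra.mean(axis=0)
    return mean_spectrum, normalised_spectra - mean_spectrum

# Toy spectra (assumption): simple slopes observed over 5-38 micron
obs = np.linspace(5.0, 38.0, 400)
redshifts = [0.02, 0.12, 0.34]
spectra = np.array([to_rest_frame(obs, 1.0 + 0.1 * k * obs, z)
                    for k, z in enumerate(redshifts)])
mean_spectrum, residuals = subtract_mean(spectra)
```

The mean-subtracted array `residuals` is what would be passed on to the covariance-matrix step.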

In the mean spectrum, we can clearly see features such
as broad PAH emission, neon fine-structure lines and broad
absorption from amorphous silicate at 9.7 and 18 μm. The
first component, PC1, is characterised by weak PAH emission,
weak neon lines, fairly strong silicate absorption and a
steep spectral slope. In both PC2 and PC3, strong PAHs are
the dominant features. Although PC4 is shown here, we
do not interpret PC4 as a proper PC because the distribution
of our sample in the PC1-PC4 plane is non-Gaussian.
For now, PC4 is interpreted as extra features in the spectra
of quasars (more discussion in Section 5.5).

One caveat to bear in mind is that the derived PCs
might be different in different set-ups of the analysis. In
addition to the set-up described above (referred to as (S1)
hereafter), we consider two more scenarios. (S2) Instead of
re-sampling the spectra to an array with equal width in $\lambda$
and subtracting the mean spectrum from all spectra, we
re-sample the spectra to an array with equal width in $\log \lambda$ and
subtract the median spectrum. (S3) Same as (S2) except
that the spectra are now expressed as $\nu F_\nu$ rather than $F_\nu$. The
different set-ups essentially give different effective weighting
to different parts of the spectra. In Fig. 2 we compare the
mean / median spectrum of our sample and the first four
PCs derived under different set-ups. We find that although
there are noticeable changes, the qualitative features in the
mean / median spectrum and the derived PCs remain the
same. For example, PC1 is characterised by weak PAH features,
strong silicate absorption and a steep spectral slope under
all set-ups. We also note that there are almost no
line features in PC4 under (S3). However, given that PC4
is not a proper PC and is not needed to reconstruct the
spectra for the majority of our objects, we conclude that the
differences in the derived mean / median spectrum and the
major PCs are immaterial to the results presented in the
following sections.

Figure 1. Upper panel: The mean spectrum of all objects in our
sample and the first four PCs. The dashed vertical lines mark the
central location of the 6.2, 7.7, 8.6, 11.2 and 12.7 μm PAH emission
features. The thin vertical lines indicate the location of the molecular
hydrogen lines at 9.66, 12.28 and 17.03 μm. The dotted vertical
lines indicate the positions of the neon fine-structure lines, [Ne II]
12.8, [Ne V] 14.3 and [Ne III] 15.6 μm. We have chosen to show
PCs with PAH features in emission. The sign of each PC is arbitrary.
Lower panel: A closer look at the 5-15 μm region of the
mean spectrum and the four PCs.

Figure 2. The mean / median spectrum of all objects in our
sample and the first four PCs derived using different set-ups of the
analysis. The black curves correspond to the mean spectrum and
the first four PCs under (S1) (see text for definitions of different
set-ups). The blue curves correspond to the median spectrum
and the derived PCs under (S2). The red curves correspond to
the median spectrum and the derived PCs under (S3).
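
The ingredients that distinguish set-ups (S2) and (S3) from (S1) can be sketched as below: a wavelength grid that is equally spaced in log wavelength, subtraction of the median rather than the mean spectrum, and conversion from F_ν to νF_ν. The function names, the reuse of 268 grid points for the logarithmic grid and the toy spectra are assumptions made for illustration only.

```python
import numpy as np

SPEED_OF_LIGHT_UM_PER_S = 2.99792458e14  # speed of light in micron per second

def log_wavelength_grid(lam_min=5.37, lam_max=23.7, n_bins=268):
    """Grid with equal width in log(lambda); n_bins=268 is an assumption."""
    return np.geomspace(lam_min, lam_max, n_bins)

def f_nu_to_nu_f_nu(wavelength_um, f_nu):
    """Convert F_nu to nu * F_nu, with nu = c / lambda (set-up S3)."""
    frequency_hz = SPEED_OF_LIGHT_UM_PER_S / np.asarray(wavelength_um, dtype=float)
    return frequency_hz * np.asarray(f_nu, dtype=float)

def subtract_median(spectra):
    """Return the median spectrum and the median-subtracted spectra (S2, S3)."""
    spectra = np.asarray(spectra, dtype=float)
    median_spectrum = np.median(spectra, axis=0)
    return median_spectrum, spectra - median_spectrum

# Toy F_nu spectra (assumption) sampled directly on the log-wavelength grid
grid = log_wavelength_grid()
toy_f_nu = np.vstack([grid, grid**2, np.ones_like(grid)])
toy_nu_f_nu = f_nu_to_nu_f_nu(grid, toy_f_nu)
toy_nu_f_nu /= toy_nu_f_nu.mean(axis=1, keepdims=True)  # unit mean, as in (S1)
median_spectrum, residuals = subtract_median(toy_nu_f_nu)
```

Because νF_ν multiplies each channel by 1/λ, short wavelengths receive more weight than in F_ν, which is one way in which the set-ups weight parts of the spectrum differently.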

3.3 A stability study: a bootstrap approach 

We use a bootstrap method² on the original dataset to study
the stability of the mean spectrum and the PCs. We repeat
the steps in Section 3.2 on each bootstrap realisation. Fig. 3
shows the 1σ region of the mean spectrum and the first
four PCs using 100 realisations. The mean spectrum is very
stable. In PC1, the stable features are the steep spectral
slope and broad silicate absorption. The former implies a
large amount of cold dust and the latter a large dust column
density. PAH features vary from weak to nearly absent,
suggesting that PAHs are not important in PC1. If there is a negative
PC1 contribution, we will get weaker silicate absorption
(or even silicate emission if it is sufficiently negative) and a
flatter/bluer spectrum, but this has almost nothing to do with
PAH features. In PC2, we generally see strong PAH emission
and a flat overall spectral shape. If there is a negative
PC2 contribution, then PAH features will be weakened but
the spectral shape will not be altered much. In PC3, again
we see a flat overall spectrum and strong PAHs. There is a
broad trough centred at ~14 μm. If PC3 is negative, then
we will see a strong mid-infrared continuum, indicating the
presence of hot dust.

2 Each bootstrap realisation is generated by sampling the original
dataset with replacement 119 times. As a result, some objects
might be sampled more than once and some objects in the original
dataset might be absent in a particular realisation.

Figure 3. The 1σ uncertainty range in the mean spectrum
and the first four PCs (from top to bottom) from 100 bootstrap
realisations of the original sample. The mean spectrum stays more
or less the same.
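
The bootstrap test of Section 3.3 might look like the sketch below, which resamples the objects with replacement, recomputes the mean spectrum and leading PCs, and reports the channel-by-channel scatter. The function names and toy data are hypothetical. Aligning the sign of each bootstrap PC with the reference PC is an added assumption, motivated by the fact that the sign of an eigenvector is arbitrary; the sketch does not handle PCs that swap order between realisations.

```python
import numpy as np

def leading_pcs(spectra, n_pcs):
    """Mean spectrum and the n_pcs leading PCs (as rows) of a sample."""
    mean_spectrum = spectra.mean(axis=0)
    residuals = spectra - mean_spectrum
    covariance = residuals.T @ residuals / spectra.shape[0]
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:n_pcs]
    return mean_spectrum, eigenvectors[:, order].T

def bootstrap_pcs(spectra, n_pcs=4, n_realisations=100, seed=0):
    """1-sigma scatter of the mean spectrum and PCs over bootstrap samples."""
    spectra = np.asarray(spectra, dtype=float)
    rng = np.random.default_rng(seed)
    n_objects = spectra.shape[0]
    _, reference_pcs = leading_pcs(spectra, n_pcs)
    means, pcs = [], []
    for _ in range(n_realisations):
        # Draw n_objects indices with replacement
        picks = rng.integers(0, n_objects, size=n_objects)
        mean_b, pcs_b = leading_pcs(spectra[picks], n_pcs)
        # Flip signs so that each PC points the same way as the reference
        signs = np.sign(np.sum(pcs_b * reference_pcs, axis=1))
        signs[signs == 0] = 1.0
        means.append(mean_b)
        pcs.append(pcs_b * signs[:, None])
    return np.std(means, axis=0), np.std(pcs, axis=0)

# Toy sample (assumption): 119 objects, 30 channels, 2 hidden components
rng = np.random.default_rng(1)
toy = rng.normal(size=(119, 2)) @ rng.normal(size=(2, 30))
toy += 0.05 * rng.normal(size=(119, 30))
sigma_mean, sigma_pcs = bootstrap_pcs(toy, n_pcs=2, n_realisations=20)
```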

3.4 Eigenvector decomposition 

We can decompose every spectrum in our sample by projecting
it onto the first four PCs,

$$f = \text{Mean Spectrum} + C1 \times \text{PC1} + C2 \times \text{PC2} + C3 \times \text{PC3} + C4 \times \text{PC4}, \qquad (3)$$

where C1, C2, C3 and C4 are the coordinates along PC1,
PC2, PC3 and PC4 respectively. The histogram of contributions
from each PC is plotted in Fig. 4. In each panel,
the distribution can be fitted by a Gaussian function, the
width of which decreases from PC1 to PC4 as expected. The
distribution for PC4 is significantly narrower than those for the first three
PCs, meaning that the number of objects which need significant
contributions from PC4 is small. Indeed, only about six
objects need a large contribution from PC4. Four of these
objects show silicate emission, one shows deep silicate absorption
and no PAH emission, and one shows strong [Ne V], indicating
that PC4 is mainly an indicator of powerful AGN.
Indeed, in Fig. 1 and Fig. 2, we can see strong neon lines and
strong silicate absorption when PC4 is positive and silicate
emission when it is sufficiently negative.

Figure 4. Histogram of contributions to each object in our sample
from each of the first four PCs, normalised so that the peak of
the distribution is equal to one. The heavy curves are Gaussian
fits to the histograms. The standard deviation of each Gaussian
fit is shown in each panel. The dashed lines mark the 2σ range.
A greater width of the Gaussian distribution indicates that more
sources need a contribution from that particular PC.
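
Equation (3) can be illustrated with a short sketch: because the PCs are orthonormal, the coefficients C1 to C4 are dot products of the mean-subtracted spectrum with each PC, and a truncated sum gives the reconstruction. The function names and the toy PCs (random orthonormal vectors) are hypothetical stand-ins, not the published PCs.

```python
import numpy as np

def project_onto_pcs(spectrum, mean_spectrum, pcs):
    """Coefficients of a spectrum along each PC (rows of pcs, orthonormal)."""
    return pcs @ (np.asarray(spectrum, dtype=float) - mean_spectrum)

def reconstruct(coefficients, mean_spectrum, pcs):
    """Eq. (3): mean spectrum plus the weighted sum of the supplied PCs."""
    return mean_spectrum + np.asarray(coefficients, dtype=float) @ pcs

# Toy PCs (assumption): four random orthonormal vectors over 268 channels
rng = np.random.default_rng(2)
n_channels = 268
orthonormal, _ = np.linalg.qr(rng.normal(size=(n_channels, 4)))
toy_pcs = orthonormal.T
toy_mean = np.ones(n_channels)

# A toy spectrum with known coefficients and no PC4 contribution
true_coefficients = np.array([2.0, -1.0, 0.5, 0.0])
toy_spectrum = toy_mean + true_coefficients @ toy_pcs

coefficients = project_onto_pcs(toy_spectrum, toy_mean, toy_pcs)
rebuilt_3 = reconstruct(coefficients[:3], toy_mean, toy_pcs[:3])
```

In this toy case the three-PC reconstruction is exact because the fourth coefficient is zero; for real spectra the residual after three PCs indicates how much a given object needs PC4.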



4 WHAT DOES EACH PC DO? 

4.1 Case studies using eigenvector decomposition 

Fig. 5 shows the spectral decomposition of eight ULIRGs, six
of which are from the IRAS BGS sample. The starburst-dominated
Arp 220 has a huge amount of cold dust at around
30 K. It has the largest positive PC1, resulting in the reddest
spectrum and very strong silicate absorption. PG 1211 is a
quasar and it has the largest negative PC1. This confirms that
PC1 roughly indicates silicate absorption depth. F18030 has
the largest positive PC2 and the largest positive PC3, consistent
with the fact that F18030 has the strongest PAH
emission of all galaxies in our sample. F08572, optically
classified as a LINER, has the largest negative PC3, which
attenuates PAH emission. It has the bluest spectrum and very deep
silicate absorption. Mrk 231 has a large negative PC1,
implying the presence of hot dust. It also has a large negative
contribution from both PC2 and PC3, resulting in very weak
PAHs and a strong mid-infrared continuum. All of these suggest
that it is AGN-dominated, which is in agreement with
its optical classification as Seyfert 1 (broad-line AGN).
The spectrum of Mrk 273, mainly characterised by a large
positive PC1 and a small negative PC2, suggests
that it is a less obscured version of Arp 220.

Figure 5. In each panel, the lower solid curve is the original spectrum and the four curves above it show the contributions from the first
three PCs (black - PC1; red - PC2; green - PC3). The blue line represents the zero level.

Table 2. Eight types of ULIRG spectra.

Type  Definition                   Number of objects
T1    PC1 > 0, PC2 > 0, PC3 > 0    17
T2    PC1 > 0, PC2 > 0, PC3 < 0    16
T3    PC1 > 0, PC2 < 0, PC3 > 0    13
T4    PC1 > 0, PC2 < 0, PC3 < 0    17
T5    PC1 < 0, PC2 > 0, PC3 > 0    14
T6    PC1 < 0, PC2 > 0, PC3 < 0    11
T7    PC1 < 0, PC2 < 0, PC3 > 0    15
T8    PC1 < 0, PC2 < 0, PC3 < 0    16

4.2 Spectral classification using PCs 

In Section 4.1, we looked at the effect of the PCs on individual
galaxies. But what is the general effect of each PC?
Given that only the first three PCs are needed to reconstruct
the majority of our spectra, it seems natural to
divide our sample into eight types according to the sign of the
contribution from each of the first three PCs. In Table 2 we
list the definitions of the eight types and the number
of objects in each type.
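
The sign-based classification can be written compactly, as in the sketch below. The function name `spectral_type` is hypothetical, and treating a coefficient of exactly zero as positive is an assumption, since the type definitions only cover strictly positive or negative contributions.

```python
def spectral_type(c1, c2, c3):
    """Map the signs of the PC1, PC2 and PC3 contributions to types T1-T8.

    T1 is (+, +, +) and T8 is (-, -, -), with PC1 as the most significant
    sign and PC3 as the least. A value of exactly zero is counted as
    positive, which is an assumption.
    """
    index = 1 + 4 * int(c1 < 0) + 2 * int(c2 < 0) + int(c3 < 0)
    return "T" + str(index)

# Toy coefficients (assumption), one example per sign pattern shown
examples = {
    (1.2, 0.4, 0.3): spectral_type(1.2, 0.4, 0.3),
    (1.2, -0.4, -0.3): spectral_type(1.2, -0.4, -0.3),
    (-1.2, 0.4, -0.3): spectral_type(-1.2, 0.4, -0.3),
    (-1.2, -0.4, -0.3): spectral_type(-1.2, -0.4, -0.3),
}
```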

In the top panel of Fig. 6 we have grouped the mean
spectra of the eight types into four groups, each of which contains
two types. The only difference between the two spectra
in each group is the sign of PC1. For example, the pair at
the bottom of Fig. 6 shows the mean spectra of objects in T1
and T5. Both types have positive PC2 and positive PC3 but
opposite signs of PC1. We can see that while PAH features
stay more or less the same in each pair, the silicate absorption
depth and the continuum shape change. Similarly, in the middle
panel of Fig. 6 we have grouped the mean spectra into four groups
with the only difference in each group being the sign of PC2.
In each group, the red curve has weaker PAH emission than
the black curve while silicate absorption and the continuum
shape remain more or less unchanged. Lastly, in the bottom
panel of Fig. 6 the only difference in the mean spectra in
each group is the sign of PC3. Galaxy types with PC3 < 0
have weaker PAH emission and stronger silicate absorption.



5 COMPARISON WITH OTHER 
DIAGNOSTICS 

5.1 Comparison with the f30/f15 continuum ratio

Veilleux et al. (2009) presented results from Spitzer IRS
observations of 74 ULIRGs and 34 Palomar Green (PG)
quasars in the local Universe (z < 0.3). They find that the
$f_{30}/f_{15}$ continuum ratio can be used as a surrogate of the
PAH-free, silicate-free MIR/FIR ratio to search for AGN activity.
There are 59 sources in their analysis which are also
included in our sample. In Fig. 7 we compare the contributions
of PC1 in these sources with the $f_{30}/f_{15}$ continuum
ratio. A good correlation between PC1 contribution and the
$f_{30}/f_{15}$ continuum ratio can be seen in the plot. ULIRGs
near the zero point for pure AGN ($\log_{10}(f_{30}/f_{15}) = 0$) have
large negative PC1 values. On the other hand, ULIRGs near
the zero point for pure starburst ($\log_{10}(f_{30}/f_{15}) = 0$) have
large positive PC1 values. The scatter in this correlation
seems to increase as PC1 contribution increases, which could
indicate that PC1 contribution does not correspond directly
to AGN contribution in objects where the dominant power
source is star formation.

Figure 6. Top: The mean spectra of the eight types, shown in
four groups so that the only difference in each group is the sign
of PC1 (black line - PC1 > 0; red line - PC1 < 0). Adjacent
spectra are normalised at 20 μm. Middle: The only difference in
each group is the sign of PC2 (black line - PC2 > 0; red line -
PC2 < 0). Bottom: The only difference in each group is the sign
of PC3 (black line - PC3 > 0; red line - PC3 < 0).

Figure 8. The total contribution of PC2 and PC3 versus the
PAH 6.2 μm equivalent width (EW6.2) for all objects in our sample
[...] Seyfert 1; small filled circles - everything else). We divide our
objects into four bins of the total contribution of PC2 and PC3,
which are indicated by the vertical dotted lines. The empty stars
represent the median values of the EW6.2 for all galaxies in each
[...] EW6.2 excluding all Seyfert 1 type galaxies.

Figure 9. The average spectra for eight classes defined in Spoon
et al. (2007) and the reconstructed spectrum using the first three
(red line) and the first four (blue line) principal components. The
black curve represents the original spectra of the eight classes.

5.2 Comparison with PAH 6.2 μm equivalent width

Another commonly used diagnostic for star formation activity
is the PAH 6.2 μm equivalent width (EW6.2). In Fig. 8
we compare the total contribution of PC2 and PC3 with
EW6.2 for all galaxies in our sample. In general, objects
with large EW6.2 have a large positive (PC2+PC3) contribution
and objects with negligible EW6.2 have a large negative
(PC2+PC3) contribution. A few galaxies with optical spectral
type classified as Seyfert 1 have near-zero PAH emission
at 6.2 μm but a positive (PC2+PC3) contribution. This is because
in these objects there are large negative contributions
from PC1, which result in negative PAH features. A positive
total contribution from PC2 and PC3 is therefore needed to cancel
out these negative PAH features.



5.3 Comparison with the Spoon IRS diagnostic 
diagram 

Spoon et al. (2007) used the equivalent width (EW) of the
6.2 μm PAH feature and the optical depth of the 9.7 μm silicate
feature as a new diagnostic tool to divide galaxies into
various classes, ranging from continuum-dominated AGN
hot dust spectra and PAH-dominated starburst spectra to
absorption-dominated spectra of deeply obscured galactic
nuclei. Fig. 9 shows the mean spectra of the eight classes
defined in Spoon et al. (2007) and the reconstructed spectra
using the first three or four PCs. Clearly, the first three PCs
already provide an accurate reconstruction for all classes.
Fig. 10 shows the contribution of each PC to each class.
The class 1A spectrum is a nearly featureless hot dust
continuum with very weak silicate absorption centred at
9.7 μm. It is mainly composed of a large negative PC1 and
a large negative PC2. The former results in weak silicate
absorption and a blue spectrum and the latter causes weak
PAHs. The class 1B spectrum clearly shows PAH emission
in addition to a hot dust continuum. Its spectral decomposition
is similar to that of class 1A; however, class
1B also requires a large positive PC3 to show strong PAH
features.

In class 1C, we see strong PAH emission caused
by a large positive PC3. The class 2C spectrum is similar to
class 1C apart from stronger silicate absorption and a steeper
20-30 μm continuum. Accordingly, class 2C has a large
positive PC2, positive PC3 and positive PC1.

The class 2B spectrum has weaker PAHs than class 2C.
It has a large positive PC1, positive PC3 and negative PC2.
Class 3B has stronger silicate absorption than 2B. It
has a positive PC1 and negative PC3. A negative PC3 contributes
most to class 3A, which has the maximum silicate
absorption. This confirms that a negative contribution
from PC3 both reduces PAH emission and deepens silicate
absorption. Class 2A, dominated by a large negative
PC3 and PC1, has weaker silicate absorption than 3A.

In Spoon et al. (2007), galaxies are found to be distributed 
along two branches in the plane defined by the 6.2 μm PAH 
EW and the 9.7 μm silicate absorption strength. For galaxies 
with strong star formation, we expect to see a positive 
total contribution from PC2 and PC3. On the other hand, 
for galaxies with weak or no PAH features, the total contribution 
from PC2 and PC3 should be negative. In other 
words, the sum of PC2 and PC3 can serve as an approximate 
measure of the intensity of star formation activity. 
The contribution from PC1 corresponds to the strength of 
silicate absorption. In Fig. [11] we plot the sum of PC2 and 
PC3 against PC1, where a pattern similar to the fork diagram 
presented in Spoon et al. (2007) is seen. The horizontal 
branch in the fork diagram in Spoon et al. (2007), from 
1A, 1B to 1C with increasing PAH equivalent width at 6.2 
μm, roughly corresponds to the diagonal line going from the 
bottom left corner to the top right corner. Similarly, the diagonal 
branch in the fork diagram, from 3A, 2B to 1C with 
decreasing silicate absorption strength and increasing PAH 
equivalent width at 6.2 μm, seems to roughly correspond to 
the diagonal line going from the top left corner to the bottom 
right corner. We emphasise that Fig. [11] does not show 
a complete fork diagram. To see the full horizontal branch 
(1A-1B-1C), the non-ULIRG AGN sample needs to be 
included as well. We also note that there are more transition 
sources showing up in Fig. [11] compared to the fork diagram. 
This is because we have included sources from other Spitzer 
ULIRG programs besides the GTO sources used in Spoon 
et al. (2007). 
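As a small illustration of the plane used in Fig. [11], the sketch below maps the contributions from the first three PCs of one spectrum onto the pair (PC1, PC2 + PC3). The function names and the numerical values are hypothetical; the sign test simply restates the expectation that a positive sum of PC2 and PC3 indicates strong star formation.

```python
def fork_plane_coordinates(contributions):
    """Return (PC1, PC2 + PC3) for a tuple of the first three contributions."""
    pc1, pc2, pc3 = contributions
    return pc1, pc2 + pc3


def has_positive_star_formation_term(contributions):
    """True when the summed PC2 and PC3 contribution is positive."""
    _, sf_term = fork_plane_coordinates(contributions)
    return sf_term > 0


# Made-up contributions, for illustration only.
agn_like = (-1.0, -0.5, -0.25)
starburst_like = (0.5, 0.75, 0.5)

agn_point = fork_plane_coordinates(agn_like)
starburst_point = fork_plane_coordinates(starburst_like)
```

In this toy case the first object falls in the lower left of the plane and the second in the upper right, which mimics the 1A to 1C direction described above.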



5.4 AGN contribution to bolometric luminosity 

Nardini et al. (2008; 2009) analysed the contribution of AGN 
and starbursts to the bolometric luminosity based on the 5-8 
μm region of the spectra. They found evidence for AGN activity 
in ~70% of their sample, which consists of 71 ULIRGs 
at z < 0.15. 55 objects in their sample are found in our 
dataset. The AGN emission is modelled as a featureless 
power law $f_\lambda \propto \lambda^{1.5}$ with an exponential attenuation $e^{-\tau(\lambda)}$, 
where the optical depth is assumed to follow $\tau(\lambda) \propto \lambda^{-1.75}$. 
The SB template is built from the five brightest pure starbursts. 
In the left panel of Fig. [12] we have plotted the contribution 
from PC3 as a function of Nardini et al.'s estimate of the 
AGN contribution to the bolometric infrared luminosity. We 
can see a clear correlation between contributions from PC3 
and $\alpha_{\rm bol}$. It supports the idea that a negative contribution 
from PC3 turns on AGN activity as discussed in the above 
sections. 
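The shape of the AGN template described above can be sketched as follows. Only the power-law index of 1.5 and the optical depth index of -1.75 come from the text; the reference wavelength of 6 μm, the normalisation to unity at that wavelength and the parameter name `tau_ref` are assumptions made for illustration, and this is not the fitting code of Nardini et al.

```python
import math


def optical_depth(wavelength_um, tau_ref, ref_wavelength_um=6.0):
    """tau(lambda) proportional to lambda^-1.75, equal to tau_ref at the
    (assumed) reference wavelength."""
    return tau_ref * (wavelength_um / ref_wavelength_um) ** -1.75


def agn_template(wavelength_um, tau_ref, ref_wavelength_um=6.0):
    """Power law proportional to lambda^1.5, attenuated by exp(-tau(lambda)).

    Normalised (by assumption) so that the unattenuated value is 1 at the
    reference wavelength.
    """
    power_law = (wavelength_um / ref_wavelength_um) ** 1.5
    tau = optical_depth(wavelength_um, tau_ref, ref_wavelength_um)
    return power_law * math.exp(-tau)


# Evaluate an unobscured and an obscured template across the 5-8 micron range.
wavelengths = [5.0, 6.0, 7.0, 8.0]
unobscured = [agn_template(w, tau_ref=0.0) for w in wavelengths]
obscured = [agn_template(w, tau_ref=1.0) for w in wavelengths]
```

Because the optical depth falls with wavelength, the attenuation is stronger at 5 μm than at 8 μm, so an obscured AGN appears redder than the intrinsic power law over this range.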

5.5 Optical spectral type 

Fig. [13] shows the distribution of objects in the PC1-PC4 
plane, colour-coded by their optical classifications including 
HII, LINER, Seyfert 1 and Seyfert 2. Clearly, this distribution 
is non-Gaussian, which means 'PC4' cannot be interpreted 
as a proper PC. However, we can say that the few 
spectra which need large PC4 contributions are outside the 
parameter space spanned by the first three PCs. The most 
obvious feature in Fig. [13] is that almost all Seyfert 1 type 
galaxies have a large negative PC4 contribution while objects 
of other types have a mean zero PC4 contribution with 
a small scatter. It shows again that PC4 has a remarkably 
good correspondence with optical QSOs. 



6 DISCUSSION AND CONCLUSION 

This is the first time to our knowledge that principal component 
analysis has been applied to mid-infrared spectra. It is 
a simple yet powerful analysis tool. The first principal component 
PC1 mainly constrains the dust temperature and the 
geometry of the distribution of source and dust. Both the 
second and the third principal components, PC2 and PC3, 
seem to regulate the intensity of star formation activity. In 
addition, a large negative contribution from the third principal 
component corresponds to a brightening AGN. For a few 
sources with spectral features indicating a dominant AGN 
(e.g. silicate emission, strong [NeV]), the first three principal 
components are not enough to accurately reproduce 
the observed spectra and a fourth principal component is required. 
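For readers unfamiliar with the procedure, the following is a minimal sketch of how principal components can be obtained from the covariance matrix of a set of spectra. It makes several assumptions that are not statements about our pipeline: spectra are mean-subtracted, the covariance uses an n - 1 normalisation, and the eigenvectors are found by power iteration with deflation, a simple stand-in for a library eigen-solver. All names and the toy data are hypothetical.

```python
import math


def covariance_matrix(spectra):
    """Covariance between wavelength bins of mean-subtracted spectra."""
    n = len(spectra)
    mean = [sum(col) / n for col in zip(*spectra)]
    centred = [[s - m for s, m in zip(row, mean)] for row in spectra]
    d = len(mean)
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_eigenpair(matrix, n_iter=500):
    """Largest eigenvalue and unit eigenvector by power iteration."""
    d = len(matrix)
    start_norm = math.sqrt(sum((i + 1) ** 2 for i in range(d)))
    vec = [(i + 1) / start_norm for i in range(d)]  # arbitrary start vector
    for _ in range(n_iter):
        new = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0.0:
            break
        vec = [x / norm for x in new]
    # Rayleigh quotient gives the eigenvalue for the converged unit vector.
    value = sum(v * sum(m * w for m, w in zip(row, vec))
                for v, row in zip(vec, matrix))
    return value, vec


def principal_components(spectra, n_components):
    """Return [(variance, component), ...] in order of decreasing variance."""
    cov = covariance_matrix(spectra)
    pairs = []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov)
        pairs.append((value, vec))
        # Deflation: remove the component just found from the matrix.
        cov = [[c - value * vec[i] * vec[j] for j, c in enumerate(row)]
               for i, row in enumerate(cov)]
    return pairs


# Toy, made-up "spectra" with 3 wavelength bins.
toy_spectra = [[2.0, 0.0, 0.0], [-2.0, 0.0, 0.0],
               [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]]
pairs = principal_components(toy_spectra, n_components=2)
toy_cov = covariance_matrix(toy_spectra)
total_variance = sum(toy_cov[i][i] for i in range(len(toy_cov)))
explained_by_first = pairs[0][0] / total_variance
```

The ratio `explained_by_first` is the toy analogue of the fraction of variance accounted for by the leading PCs.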

Using principal component analysis, we are effectively 
finding an orthogonal basis to describe the variance which 
is assumed to fully characterise the objects under study. So 
the question is what mechanism/physics is causing spectra 
to vary. Are spectral differences related to different evolu- 
tionary stages of the ULIRG population? If so, by adding a 
positive or negative contribution from each PC to adjust the 
relative strength of various features (e.g. PAH emission, sil- 
icate absorption, spectral slope), the evolutionary stages of 
ULIRGs are modified. In this paper, we have compared our 
principal component analysis with various other diagnostics 
(such as the $f_{30}/f_{15}$ continuum ratio, the PAH 6.2 μm equivalent 
width and 9.7 μm silicate absorption strength). There 
is some tentative evidence that the principal components 
are linked to the evolutionary stages. The effect of each PC 
on the evolution of the ULIRG population will be further 
investigated using radiative transfer models in a future paper. 

Figure 10. The average spectrum of each of the classes (the lower curve in each panel) and the contribution of each of the first three 
PCs (black - PC1; red - PC2; green - PC3). The blue line represents the zero level. 
Figure 11. PC1 versus the sum of PC2 and PC3. The red dots 
represent Spoon classes and the open circles represent our sample. 
The diagonal lines and the arrow (to represent the effect of 
continuum dilution) are used to indicate the similarities between 
this plot and the fork diagram presented in Spoon et al. (2007). 
Figure 13. Contributions from PC4 versus that from PC1, 
colour-coded by optical classes (blue dots - HII; green dots - 
LINER; yellow dots - Seyfert 2; red dots - Seyfert 1; black dots - 
unknown optical classes). 